L4 LOOKUP IMPLEMENTATION USING within the CAM where the first match of the key is found

EFFICIENT CAM ORGANIZATION and is conventionally used as a pointer into a separate RAM 

containing the data to be returned from the lookup. 

In performing routing and switching lookups, the key is

BACKGROUND OF THE INVENTION typically a combination of the destination address of the

1. Field of the Invention packet and other parameters describing (for example) the
source interface on which the packet was received. The

The present invention relates to data networking, in return value (the index) can take on several forms, the most
particular routing and switching systems. commonly used being a new destination address which tells

2. Description Of The Related Art the switch where to send the packet.

Many types of routing or switching systems are well The operation of CAMs, and their close cousin the ternary

known in the data communications arts. In the most abstract CAM (TCAM), in the context of routing lookups is further

sense, these systems consist of the elements noted in FIG. 1. described in M. Waldvogel et al., Scalable High Speed IP

A packet generated in or originating from a source network Routing Lookups, Proceedings of the ACM SIGCOMM '97

110 arrives at a switch device 120. The packet is switched Conference on Applications, Technologies, Architectures,

based on its header information to a destination network and Protocols for Computer Communication, Sep. 14-18, 

130. Switch 120 consists, in its most general form, of a 1997, Cannes, France, incorporated herein by reference in its 

destination lookup device 121, a switch fabric or switch entirety. 

mechanism 123, and a rewrite device 127. One drawback long seen in the use of CAMs is that CAMs 
Lookup device 121 examines the source packet header (as themselves have a finite (and often limited) allowable key
found, for example, in Internet Protocol [IP] packets) for width. That is, CAMs are typically built to a certain maximum
information indicating the next device that is to receive key width of, for example, 32 or 64 bits. As routing
and/or process the packet. That next device information is information complexity increases with the addition of
extracted, by means also well known in the art, and typically protocol, access control, and queuing/buffering control
used to access some manner of a lookup table or tree parameters to the lookup key, the number of bits needed in
structure to determine the appropriate routing information. the key rapidly exceeds the maximum width of the CAM.
The routing information is then used to forward the packet This problem has previously been solved by performing
through switch fabric 123 and out to the destination network parallel lookups in a plurality of CAMs, thus supplying an
130. Such routing decisions can be made at Layer 3 (L3) or effectively wider CAM through banking. However, such a
Layer 4 (L4) of the well-known OSI reference model. solution is very expensive to implement in that CAM
Switch fabric 123 is conventionally provided by any of a devices are costly in terms of price, power requirements, and
number of well-known blocking or non-blocking switching thermal dissipation. Also, the time delays involved in forming
systems used in the art and is not further discussed herein. multiple keys, performing lookups in multiple physical
Rewrite device 127, which is often (but not always) devices, and concatenating or otherwise processing multiple
included in switch 120, physically changes the packet to put results can provide an unacceptable packet slowdown in
new destination and source information into the packet high-speed routing systems.
header immediately prior to sending the packet out to the Accordingly, what is needed is a way to rapidly access a
destination network. In some cases, rewrite device 127 single CAM or other lookup device with a very wide key and
might also do a secondary lookup to determine other routing return quickly and efficiently.
information to help speed the packet on its way or to provide SUMMARY
packet multicast. As is well-known in the art, multicasting

[...] content addressable memory (CAM). The method uses a

FIG. 1 is a highly simplified [...] sequence of wide key lookups in a single CAM and provides

and routers in use today. Additional elements, such as faster and more efficient use of CAM space. Multiple CAM

buffering and queuing systems, statistics collection, devices and/or multiple slow lookups are avoided.

advanced access control and other protocol-dependent

mechanisms have been omitted for clarity. Embodiments of the invention may be used in

content addressable memories, ternary CAMs, or other
Lookup device 121, in particular, presents a number of variations on a content addressable memory device with

interesting opportunities for optimization of the lookup facility.

function. Optimization is considered a long-felt need in the Continuing series of lookups, such as those required to

switching and routing industry because of the continuing process the long

streams of packets provided in modern data

push for higher switch throughput and greater price- communications systems, can be performed in either a

performance improvements. sequential or pipelined fashion. In a sequential lookup

As noted above, the lookup function essentially consists arrangement, each of the several lookups necessary to fully 

of extracting certain elements of the packet header and switch a packet are performed one after another. In a 

forming a tuple of those elements (by concatenation, for pipelined arrangement (in steady state), the first lookup for 

example) to use as a lookup key. This lookup key is then a new packet is performed at the same time as at least one 

presented to a search engine, which is known to take on of the subsequent lookups required for packets received

many forms in the art. The search engine performs its search immediately beforehand. In other words, in a system requiring

looking for matches to the key and returning the information (for example) three lookups per packet, the first lookup

corresponding to that key. for packet N is performed at the same time as the second

One method of doing such a lookup known in the art is to lookup for packet N-1, and the third lookup for packet N-2,

use a content addressable memory (CAM) as part of the where N is a packet received at a given time interval and

search engine. A CAM is a memory device which returns an packets N-x are packets received in the x-th time interval

index when presented with the key. The index is the address immediately preceding packet N.
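The CAM behaviour described in the background (a key is presented, the address of the first matching entry is returned, and that address is conventionally used as a pointer into a separate RAM) can be sketched in software as follows. This is a minimal illustrative model only, not the hardware itself; the names `cam_lookup`, `cam_entries`, and `result_ram`, and the sample contents, are assumptions introduced here.

```python
from typing import Optional, Sequence


def cam_lookup(entries: Sequence[Optional[int]], key: int) -> Optional[int]:
    """Return the address (index) of the first entry equal to key, or None on a miss.

    A hardware CAM compares the key against every entry at once; this loop
    models only the result (the first matching address), not the parallelism.
    """
    for address, stored in enumerate(entries):
        if stored is not None and stored == key:
            return address
    return None


# Hypothetical contents: CAM entries and the RAM words they point at.
cam_entries = [0x0A000001, 0x0A000002, 0xC0A80001, 0x0A000002]
result_ram = ["port 1", "port 2", "port 3", "port 4"]

index = cam_lookup(cam_entries, 0x0A000002)   # first match is at address 1
data = result_ram[index] if index is not None else None
```

Note that the duplicate key at address 3 is never reached, which is why the ordering of entries within the CAM matters.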
In another alternate embodiment, packet lookup FIG. 2 is a high-level flow chart of the method according
sequences on packets arriving on multiple ports of a single to one embodiment of the present invention.

device may be carried in a time division multiplexed (TDM) FIG. 3 is a high-level functional block diagram of an
arrangement. In such an arrangement, a given window of exemplary CAM according to one embodiment of the
time (e.g., window a) in the search engine is set aside for all present invention.

of the lookups from port A. Time window b is set aside for FIG. 4 is a high-level functional block diagram of a
processing the lookups necessary to service packets arriving pipelined lookup sequence according to an alternate embodiment

on port B, and so on for each input port. Each of these time of the present invention.

windows may be sized so that all of the necessary lookups The use of the same reference symbols in different drawings
in a sequence for a single packet are performed. Alternately, indicates similar or identical items.
each time window may be sized such that packet lookup DETAILED DESCRIPTION 
sequences for more than one packet received at each des- Process Flow 

ignated port are processed within that window. FIG. 2 shows a flow chart of the method 200 of one 

The speed and efficient packing of the data tables within embodiment of the present invention. In step 201, the packet

the content addressable memory space is provided by the is received and the header is read by means well known in

intelligent selection of packet sub-fields for lookup and the the art. In next step 210, the 8-bit protocol field located in the

order in which the lookups are performed. Optimization is Internet Protocol (IP) header is read and used to perform a

also provided by ordering the tables within the CAM in such direct lookup in the Protocol Table. This lookup, which

a way that the shortest and least frequently changed tables determines a pointer to the root tree for a particular server

are at the highest end of the CAM (i.e., the end of the CAM flow associated with the inbound packet, returns the value of

address space closest to address 0x0000 0000h) and compressed the Server Flow Table root tree pointer sFlowRTP.

more tightly in the CAM space. Step 215 checks to determine if the sFlowRTP value is

In one embodiment of the present invention, the lookup valid. This check, in one embodiment of the present

sequence for a single packet is broken into three levels. First, invention, is nothing more than a simple "not equal to zero"

a 5-tuple key (consisting of the 104 bits needed for a Layer test to determine if a valid (non-zero)

4 lookup) is divided into three subkeys, one for each level. sFlowRTP pointer has been returned. If there is no sFlowRTP

The first subkey consists solely of an 8-bit protocol field. pointer, i.e., sFlowRTP is equal to zero, Layer 4

This key is used to index a small, 256 entry Protocol Table switching is aborted and conventional Layer 3 switching is

within the CAM to directly read a corresponding Server performed by means well known in the art at step 289.

Flow Table root tree pointer (sFlowRTP). If the sFlowRTP is a valid number, then step 220 performs

Next, the server information tuple is used as the lookup the server information lookup using the server IP address

key. The 32-bit IP address and the 16-bit server port number and server port number fields from the packet header. To

(forming the server information tuple) are used as the key to [...] lookup, the server IP address (a 32-bit number) and

a Server Flow Table in the CAM to look up the Client Flow the server port number (a 16-bit number) are concatenated

Table root tree pointer (cFlowRTP). The beginning (root) of [...] the lookup key in a CAM lookup. The value

the Server Flow Table is identified by the protocol lookup: returned by this lookup is the root tree pointer to the

the Server Flow Table root tree pointer defines the first Client Flow Table containing the client flow

address of the Server Flow Table in the CAM to be used for [...], designated cFlowRTP. Choosing to

the server information lookup. perform a server-only lookup helps to optimize Layer 4

Finally, client information lookup is performed using the [...]

32-bit client IP address and the 16-bit client port number as [...] in the art.

This is so because the server IP address and server port

the client information tuple. As before, the results of the last [...]

lookup (in this case, the Client Flow Table root tree pointer) together define [...] of

[...] Flow Table within the CAM to be used for [...] lookup. The [...] information that does not

results of the client information lookup define the MAC [...]

number of servers and their port information does not

rewrite information and (optionally) any network address [...] dynamically as the number of clients and the
translation (NAT) information for Layer 4 switching of this
particular packet. individual packet streams from individual clients. Generally
speaking, the constellation of servers present in a network is
Multiple Server Flow and Client Flow [...] relatively static. By first identifying and isolating the processing
in the lookup memory space. The number of Server Flow (i.e., the switching lookups) based on server flow
Tables is determined by the number of discrete Server Flow identifying information, process 200 simplifies the switching
Table root tree pointers listed in the Protocol Table. decisions necessary to fully route the packet.
Similarly, the number of Client Flow Tables is defined by the FIG. 3 shows a high-level functional block diagram of an
number of discrete Client Flow Table root tree pointers in all exemplary CAM 300, according to one embodiment of the
of the Server Flow Tables. Both of these table spaces (i.e., present invention. CAM 300 is configured to perform the
the set of Server Flow Tables and the set of Client Flow server information lookup of step 220 above. Lookup key
Tables) are managed by system operator programming 305 is formed by concatenating server IP address 302 with
(configuration) of the search engine. server port number 304. Bits [63:48] are reserved for other

BRIEF DESCRIPTION OF THE DRAWINGS [...]

CAM is illustrated, [...] will realize that other CAM widths

The present disclosure may be better understood and the and/or concatenations can be used. Accordingly, the invention

numerous features and advantages made apparent to those is not limited to any particular CAM width or concatenation

skilled in the art by referencing the accompanying drawings. scheme.)

FIG. 1 is a high-level functional schematic of a router/ In accordance with conventional CAM usage, lookup key 

switch system. 305 is compared against all entries 310 of CAM 300.
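The three-level sequence summarized above (protocol index, then server information lookup, then client information lookup, each validated with a "not equal to zero" test) can be sketched as follows. This is a simplified software model under stated assumptions, not the patented implementation: Python dictionaries stand in for the CAM regions, the IP address is assumed to occupy the upper bits of the concatenated key, and every table entry, address, and name (`make_key`, `l4_lookup`, `protocol_table`, and so on) is hypothetical.

```python
import ipaddress
from typing import Optional


def make_key(ip: str, port: int) -> int:
    """Concatenate a 32-bit IPv4 address and a 16-bit port into a 48-bit key.

    In a 64-bit CAM word this leaves bits [63:48] zero (reserved).
    Placing the IP address in the upper bits is an assumption.
    """
    return (int(ipaddress.IPv4Address(ip)) << 16) | (port & 0xFFFF)


# Hypothetical tables. A pointer value of 0 means "invalid".
protocol_table = [0] * 256            # 256 entries, indexed by the 8-bit protocol field
protocol_table[6] = 1                 # protocol 6 (TCP) -> sFlowRTP 1

server_flow_tables = {1: {make_key("10.0.0.5", 80): 7}}              # -> cFlowRTP 7
client_flow_tables = {7: {make_key("192.168.1.20", 40000): "mac-rewrite-A"}}


def l4_lookup(protocol: int, server_ip: str, server_port: int,
              client_ip: str, client_port: int) -> Optional[str]:
    """Return the flow entry, or None to signal fallback to Layer 3 switching."""
    s_flow_rtp = protocol_table[protocol & 0xFF]          # direct index (step 210)
    if s_flow_rtp == 0:                                   # validity test (step 215)
        return None
    server_table = server_flow_tables.get(s_flow_rtp, {})
    c_flow_rtp = server_table.get(make_key(server_ip, server_port), 0)   # step 220
    if c_flow_rtp == 0:                                   # validity test (step 225)
        return None
    client_table = client_flow_tables.get(c_flow_rtp, {})
    return client_table.get(make_key(client_ip, client_port))            # step 230


entry = l4_lookup(6, "10.0.0.5", 80, "192.168.1.20", 40000)
```

Each stage uses a key of at most 48 bits, so the 104-bit 5-tuple never has to be presented to the lookup memory as a single wide key.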



US 6,862,281 B1

Matching entry 320 at address "N" thus provides, in some Furthermore, the type of lookup memory used is not

embodiments, a pointer to a secondary RAM structure strictly limited to a content addressable memory. Equally

containing the corresponding cFlowRTP. In an alternate usable is the well-known ternary CAM (TCAM) as well as

embodiment, address "N" can be used as the cFlowRTP other fast memory table lookup structures seen in the art

directly. today.

A similarly configured CAM, either within another CAM Pipelined vs. Time Sequential Lookups

device or a region within the same CAM and defined by While the switching lookup method 200 discussed above

selective masking (as that practice is known in the CAM is presented in a simple sequential ordering, one of ordinary

arts), is used for the client information lookup described skill in the art will see that such a multi-step lookup function

below. is readily adaptable to pipelined operation as well.

Next, in step 225, cFlowRTP is tested to be certain that it Pipelining, as that term is known in the art, refers to the

is a non-null value. If cFlowRTP is equal to zero, the Layer practice of performing all of the steps in a multiple step

4 switching lookup process will abort at step 289. If, sequence at the same time, but on different work objects. In

however, cFlowRTP is valid, that value is used to point to the the context of packet switching, pipelining often refers to the

beginning (root) of the appropriate client flow lookup table. notion of performing the various steps of switching each

This table is "appropriate" in the sense that it is a table packet in parallel on a number of packets equal to the

constructed (by conventional means) based on the routing number of steps in the switching process. For example, in a

information identified by the network for packets from the 3-step switching lookup such as that of method 200

particular server flow identified in the previous lookup step described above, the first packet received begins processing

220. in the first of the three steps. The next packet received is

The client information lookup is then performed at step processed in the first step (here step 210) while the first

230. Here, the client IP address and the client port number packet received is processed in the second step, step 220. 

are used to determine the ultimate Layer 4 flow entry values The third packet received is processed in step 210 at the 

unique to the combination of IP protocol, server flow, and same time the second packet is in step 220 and the first

client flow of the inbound packet. packet is in step 230. By the time the fourth packet arrives

In one embodiment of the present invention, final lookup at the pipeline, the first packet to arrive has exited the 

step 230 returns the MAC rewrite information necessary to pipeline and the second packet to arrive is ready to undergo 

provide Layer 4 destination information for the packet's the very last step. 

next hop. That information is tested in step 233 for validity. In some embodiments of the present invention, method
If that information is non-zero (i.e., there is a valid MAC 200 is carried out in a pipelined fashion on the continuous
rewrite value returned) the packet is Layer 4 switched by stream of packets received at the switch. In other embodiments
conventional means in step 299. If, as discussed above, the of the present invention, each packet is handled alone
final flow entry is invalid (i.e., equal to zero) L4 switching and the processing/switching of the next packet received
is aborted and L3 switching is carried out in step 289. waits until all lookups are performed for the first packet
In an alternate embodiment of the present invention, received. Clearly, a pipelining sequence has extra complexities
lookup step 230 also returns network address translation which may well be offset by the throughput improvements
(NAT) information that remaps the source and/or destination often realized in pipelined systems. As is presently
address of the switched packet to comply with any one of a conceived, however, the single stream, non-pipelined processing
number of means of network address translation schemes is believed to be the simplest and most effective at
well known in the art. In fact, since the set of client flow this time, because it can be performed at wire speed, without
tables provided in the CAM can be of almost any size introducing additional packet latency, and with the simplest
(limited, of course, by the size of the CAM), any number of implementation cost.

different and/or additional data fields can be supplied by the FIG. 4 illustrates, at a high level of abstraction, the

last lookup in order to implement different Layer 4 switching operation of a pipelined lookup apparatus 400 in schematic
and/or NAT schemes. As one of ordinary skill in the art form. A stream of packets 410, represented by their header

would be well versed with different methods of L4 switching information Header0, Header1, Header2, Header3, etc. enter

and remapping schemes, the present invention will be understood sFlowRTP Lookup Engine 420. In a pipelined system

to comprise all such variations as known in the art according to one embodiment of the present invention

today. illustrated here, at time 0, packet Header0 has just been
The above discussion refers to the various server flow and processed in sFlowRTP Lookup Engine 420, Header-1 (i.e.,

client flow root pointer tables simply as tables existing in the header of the packet immediately preceding the Header0

CAMs. One of ordinary skill in the art will readily see, packet) has just been processed in cFlowRTP Lookup

however, that a number of different table organizations are Engine 430 and Header-2 has just been processed in L4 Info

equally applicable to CAM type lookups. In particular, the Lookup Engine 440. Table 470 identifies which packet
well-known Patricia tree is familiar in the art as providing an header gets processed in each engine (or "stage," as the

even more efficient access structure for CAM type lookups. pipeline elements are commonly called) after each time

Other tree structures, themselves well known in the art, interval. Thus, after the next time interval (Time 1),

are also usable in a CAM type lookup. For example, any of Header1 will have been processed by sFlowRTP Lookup

a number of the high speed IP routing lookup schemes Engine 420, Header0 will have been processed by cFlowRTP
discussed in Waldvogel, et al. (cited above) may also be Lookup Engine 430, and Header-1 will have been

used. Patricia trees are discussed in further detail in D. R. processed by L4 Info Lookup Engine 440.

Morrison, PATRICIA - Practical Algorithm to Retrieve The operation of each engine is as follows: in sFlowRTP

Information Coded in Alphanumeric, Journal of the ACM, Lookup Engine 420, the 8-bit protocol field is used to index

Vol. 15, no. 4, pp. 514-534 (October 1968) and G. H. Protocol Table 425, returning sFlowRTP. The sFlowRTP
Gonnet, Handbook of Algorithms and Data Structures, pp. value (and, in some embodiments, the current header) are

109 (1984), incorporated herein by reference in their entireties. then passed to cFlowRTP Lookup Engine 430 in the next



time interval. 
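The stage occupancy that Table 470 is said to record (Header0, Header-1, and Header-2 in the three engines at time 0, each shifting one engine onward per time interval) can be sketched as a simple schedule. This is an illustrative model only; the function name `headers_in_stages` and the integer header numbering are assumptions made for the example.

```python
# Stage names taken from the description of FIG. 4, in pipeline order.
STAGES = (
    "sFlowRTP Lookup Engine 420",
    "cFlowRTP Lookup Engine 430",
    "L4 Info Lookup Engine 440",
)


def headers_in_stages(t: int) -> dict:
    """Map each stage to the header number it has just processed at time t.

    Stage k (0-based) lags the first stage by k time intervals, so it holds
    header t - k. Negative numbers denote packets that arrived before Header0.
    """
    return {stage: t - depth for depth, stage in enumerate(STAGES)}


schedule_t0 = headers_in_stages(0)
schedule_t1 = headers_in_stages(1)
```

In steady state every engine is busy in every interval, so one packet completes per interval even though each packet still needs three lookups.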
The cFlowRTP Lookup Engine 430 generates its lookup On the other hand, if the system operator determines that

key as described above and performs the lookup in Server a miss rate of x/2 percent is required, then the corresponding

Flow CAM 435, returning lookup result cFlowRTP. The number of server and client flow entries could approximately

cFlowRTP (and, in some embodiments, the current header) double. In such a scenario, CAM size must necessarily

are then passed to L4 Info Lookup Engine 440 in the next [...] lookup

time interval. L4 Info Lookup Engine 440 forms its lookup key as described [...]

[...] well known in the various arts that employ CAM and similar
CAM 445. The result returned is L4 Rewrite data 450, which memory structures to cascade such devices lengthwise.
passes out of the pipeline in the next time interval. In a lengthwise cascade, the lookup key is applied

Time Division Multiplexing simultaneously to multiple physical CAM banks having

A further alternative scheme for employing the method of [...] width. An external priority encoder is used to

the present invention involves using time division multiplexing [...] matches from the banks in order to yield the

(TDM) for switching lookups. With TDM, multiple [...] match. [...] the present invention is not
streams of packets are processed in a timesharing scheme. limited to a CAM structure comprising a single physical
The need for a TDM scheme can arise when a switch device device. Rather, the present invention may include a CAM
receives more than one stream of packet data, for instance, [...] provided by multiple banks of physical devices in
when it is configured to receive packet streams on multiple a [...] cascade configuration, as such configuration is
input ports. One well-known method of processing multiple [...] in the art today.

packet streams is simply to replicate the processing equipment Alternate Embodiments

with a dedicated set of equipment for each input port. The order in which the steps of the present invention
This, however, is very expensive and results in an arithmetically method are performed is purely illustrative in nature. In fact,
increased use of resources and real estate as the number the steps can be performed in any order or in parallel, unless
of ports grows. otherwise indicated by the present disclosure.

It is preferable, therefore, to reduce the amount of equipment The method of the present invention may be performed in
and real estate needed to perform multi-port/multi-stream either hardware, software, or any combination thereof as
packet switching by using only a single switching those terms are currently known in the art. In particular, the

lookup subsystem to process packets from all ports. The present method may be carried out by software, firmware, or

method of the present invention also provides an answer to microcode operating on a computer or computers of any
this problem. Because the system and method of the present type. Additionally, software embodying the present invention
invention can run at extremely high speeds, and because the may comprise computer instructions in any form (e.g.,
sequenced table lookup utilizes the input port number, there [...] interpreted code, etc.) stored in

is no inherent limitation to the use of the present invention any computer-readable medium (e.g., ROM, RAM, magnetic

in multi-port switch devices. The input port number, which media, punched tape or card, compact disk [CD] in any
is the client port number for packets received from a client form, DVD, etc.). Furthermore, such software may also be
or the server port number for packets received from a server, in the form of a computer data signal embodied in a carrier
is part of the switching decision process. Thus, the only wave, such as that found within the well-known Web pages

constraint on the overall process devolves to simply having transferred among devices connected to the Internet.
enough table space to support the variety of server and client Accordingly, the present invention is not limited to any
flows desired by the system operator, just as in a single port particular platform unless specifically stated otherwise in the
implementation. present disclosure.
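The time division multiplexed arrangement described above, in which a single lookup subsystem serves each input port in its own time window, can be sketched as a round-robin schedule. This is a minimal sketch under stated assumptions: the function name `tdm_schedule`, its parameters, and the choice to skip the window of a port with no waiting packets are all hypothetical, and real hardware would likely leave such a window idle instead.

```python
from collections import deque


def tdm_schedule(port_queues, packets_per_window=1, lookups_per_packet=3):
    """Return the order of (port, packet, lookup_step) slots on a shared lookup engine.

    Each port gets one window per round. Within its window, up to
    packets_per_window packets each run their full lookup sequence.
    """
    queues = {port: deque(packets) for port, packets in port_queues.items()}
    slots = []
    while any(queues.values()):
        for port, queue in queues.items():            # one window per port per round
            for _ in range(packets_per_window):
                if not queue:
                    break                             # assumption: empty windows are skipped
                packet = queue.popleft()
                for step in range(1, lookups_per_packet + 1):
                    slots.append((port, packet, step))
    return slots


# Hypothetical traffic: two packets waiting on port A, one on port B.
slots = tdm_schedule({"A": ["a0", "a1"], "B": ["b0"]})
```

Setting `packets_per_window` above 1 models the alternative in which each window is sized to process more than one packet from its port.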

CAM Sizing While particular embodiments of the present invention

The key factor in determining CAM size is the number of have been shown and described, it will be apparent to those
flow entries the system operator wishes to maintain in the skilled in the art that changes and modifications may be

various tables contained within the CAM. If the operator made without departing from this invention in its broader
wishes to minimize CPU calls and table rewrites (which are aspects and, therefore, the appended claims are to encompass

the well-known consequences of not finding a route after a within their scope all such changes and modifications as fall

lookup), then the CAM size must necessarily grow to within the true spirit of this invention.
include as many routes as possible. If, however, the system I claim:

operator (or other person responsible for configuring the 1. A method for use in a switch, comprising the
router/switch system) determines a certain miss rate in the steps of:

CAM tables is acceptable, then the CAM size can be parsing a packet header to obtain a plurality of data fields;
reduced. Such reduction in CAM [...] selecting at least one of said plurality of data fields as a
[...] protocol identifier and using said protocol identifier to

determined analytically and implemented through means directly access a Server Flow Table start pointer

well-known in the art and currently in use today. (sFlowRTP);

For example, if the system operator were to determine that

a miss rate of x percent were acceptable, where x performing a first lookup using said sFlowRTP and using
percent implies that only 100 server flows and 10,000 one or more of said data fields as a first key to

client flows were required, then the CAM needs only be determine a Client Flow Table start pointer
sized to hold the 10,100 flow tables (each sized to hold the (cFlowRTP);

appropriate number of entries). Note, of course, that the performing a second lookup using said cFlowRTP and

present invention requires space in the CAM or other lookup using one or more of said data fields as a second key to

memory structure that actually holds three tables: the protocol obtain a flow entry; and

lookup table, the server flow table, and the client flow switching said packet using said flow entry.

table. As noted above, the protocol table is generally quite 2. The method of claim 1, wherein said first lookup or said

small and in one embodiment requires only 256 entries. second lookup is performed in a Patricia tree.
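The sizing arithmetic in the example above (100 server flows plus 10,000 client flows giving 10,100 flow tables, alongside a 256-entry protocol table) can be written out as a short calculation. This is a rough sketch only: the text does not state how many entries each flow table holds, so the uniform `entries_per_table` parameter and the function names are assumptions made for illustration.

```python
PROTOCOL_TABLE_ENTRIES = 256   # from the text: one entry per 8-bit protocol value


def flow_table_count(server_flows: int, client_flows: int) -> int:
    """Number of flow tables to hold, following the 100 + 10,000 = 10,100 example."""
    return server_flows + client_flows


def cam_entries_needed(server_flows: int, client_flows: int,
                       entries_per_table: int) -> int:
    """Total entries: the protocol table plus every flow table.

    entries_per_table is a hypothetical uniform size; in practice each
    table would be sized to its own appropriate number of entries.
    """
    tables = flow_table_count(server_flows, client_flows)
    return PROTOCOL_TABLE_ENTRIES + tables * entries_per_table


tables = flow_table_count(100, 10_000)
total = cam_entries_needed(100, 10_000, entries_per_table=4)
```

Doubling both flow counts, as in the x/2 miss-rate case, roughly doubles the total because the fixed 256-entry protocol table is small by comparison.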
3. The method of claim 1, wherein said first lookup or said performing a second lookup using said cFlowRTP and
second lookup utilizes a content addressable memory. using one or more of said data fields as a second key to

4. The method of claim 3, wherein said content addressable obtain a flow entry; and
memory comprises a ternary content addressable switching said packet using said flow entry.
memory. 16. The computer-readable medium of claim 15, wherein

5. The method of claim 1, wherein said direct access, said said first lookup or said second lookup is performed in a
first lookup, and said second lookup are pipelined. Patricia tree.

6. The method of claim 1, wherein for packets arriving at 17. The computer-readable medium of claim 15, wherein

a plurality of ports, at least said selecting, said performing said first lookup or said second lookup utilizes a content

said first lookup, and said performing said second lookup are

time division multiplexed. addressable memory.

7. The method of claim 1, further comprising: 18. The computer-readable medium of claim 17, wherein
validating said sFlowRTP prior to performing said first said content addressable memory comprises a ternary

lookup; content addressable memory.

validating said cFlowRTP prior to performing said second 19. The computer-readable medium of claim 15, wherein

lookup; and said direct access, said first lookup, and said second lookup

validating said flow entry prior to performing said switching. are pipelined.
20. The computer-readable medium of claim 15, wherein

8. A computer switching system, comprising computer for packets arriving at a plurality of ports, at least said
instructions for: selecting, said performing said first lookup, and said performing

parsing a packet header to obtain a plurality of data fields; said second lookup are time division multiplexed.

selecting at least one of said plurality of data fields as a 21. The computer-readable medium of claim 15, further

protocol identifier and using said protocol identifier to comprising:

directly access a Server Flow Table start pointer validating said sFlowRTP prior to performing said first

(sFlowRTP); lookup;

performing a first lookup using said sFlowRTP and using validating said cFlowRTP prior to performing said second

one or more of said data fields as a first key to lookup; and

determine a Client Flow Table start pointer (cFlowRTP); validating said flow entry prior to performing said switching.

performing a second lookup using said cFlowRTP and 22. A computer data signal embodied in a carrier wave,

using one or more of said data fields as a second key to comprising computer instructions for:

obtain a flow entry; and parsing a packet header to obtain a plurality of data fields;

switching said packet using said flow entry.

9. The switching system of claim 8, wherein said first selecting at least one of said plurality of data fields as a
lookup or said second lookup is performed in a Patricia tree. protocol identifier and using said protocol identifier to

10. The switching system of claim 8, wherein said first directly access a Server Flow Table start pointer
lookup or said second lookup utilizes a content addressable (sFlowRTP); 

memory. performing a first lookup using said sFlowRTP and using 

11. The switching system of claim 10, wherein said one or more of said data fields as a first key to
content addressable memory comprises a ternary content determine a Client Flow Table start pointer
addressable memory. (cFlowRTP);

12. The switching system of claim 8, wherein said direct performing a second lookup using said cFlowRTP and
access, said first lookup, and said second lookup are pipelined. using one or more of said data fields as a second key to
obtain a flow entry; and

13. The switching system of claim 8, wherein for packets switching said packet using said flow entry.
arriving at a plurality of ports, at least said selecting, said 23. The computer data signal of claim 22, wherein said
performing said first lookup, and said performing said first lookup or said second lookup is performed in a Patricia
second lookup are time division multiplexed. tree.

14. The switching system of claim 8, further comprising: 24. The computer data signal of claim 22, wherein said
validating said sFlowRTP prior to performing said first first lookup or said second lookup utilizes a content addressable

lookup; memory.

validating said cFlowRTP prior to performing said second 25. The computer data signal of claim 24, wherein said

lookup; and content addressable memory comprises a ternary

validating said flow entry prior to performing said switching. content addressable memory.

26. The computer data signal of claim 22, wherein said

15. A computer-readable medium storing a computer direct access, said first lookup, and said second lookup are
program executable by a computer, the computer program pipelined.

comprising computer instructions for: 27. The computer data signal of claim 22, wherein for

parsing a packet header to obtain a plurality of data fields; packets arriving at a plurality of ports, at least said selecting,

selecting at least one of said plurality of data fields as a said performing said first lookup, and said performing said

protocol identifier and using said protocol identifier to second lookup are time division multiplexed.

directly access a Server Flow Table start pointer 28. The computer data signal of claim 22, further comprising:

(sFlowRTP);

performing a first lookup using said sFlowRTP and using validating said sFlowRTP prior to performing said first

one or more of said data fields as a first key to lookup;

determine a Client Flow Table start pointer validating said cFlowRTP prior to performing said second

(cFlowRTP); lookup; and
validating said flow entry prior to performing said switch-
ing. 

29. A computer switching system comprising: 
means for parsing a packet header to obtain a plurality of 

data fields; 

means for selecting at least one of said plurality of data 
fields as a protocol identifier and using said protocol
identifier to directly accessing a Server Flow Table start 
pointer (sFlowRTP); 

means for performing a first lookup using said sFlowRTP 
and using one or more of said data fields as a first key 
to determine a Client Flow Table start pointer 
(cFlowRTP); 

means for performing a second lookup using said cFlow- 
RTP and using one or more of said data fields as a 
second key to obtain a flow entry; and 

means for switching said packet using said flow entry. 

30. The method of claim 29, wherein said means for 
performing said first lookup or said second lookup comprise 
a Patricia tree. 

31. The method of claim 29, wherein said means for 
performing said first lookup or said second lookup comprise 
a content addressable memory. 

32. The method of claim 29, wherein said means for
directly accessing and said means for performing said first 
lookup and said second lookup are pipelined. 
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33. The method of claim 29, wherein for packets arriving 
at a plurality of ports, at least said means for selecting, said 
means for performing said first lookup, and said means for 
performing said second lookup are time division multi- 
plexed. 

34. The method of claim 29, further comprising: 
means for validating said sFlowRTP prior to performing

said first lookup;
means for validating said cFlowRTP prior to performing 

said second lookup; and 
means for validating said flow entry prior to performing 

said switching. 

35. A lookup method for use in a switch, comprising the 
steps of:

parsing a packet header to obtain a plurality of data fields; 

selecting at least one of said plurality of data fields as a 
protocol identifier; 

performing a server flow lookup using said protocol 
identifier and one or more of said data fields to deter- 
mine said Client Flow Table start pointer (cFlowRTP); 

performing a second lookup using said cFlowRTP and 
using one or more of said data fields as a second key to 
obtain a flow entry; and 

switching said packet using said flow entry. 
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[0008] As noted above, the lookup function essentially consists of extracting certain 
elements of the packet header and forming a tuple of those elements (by concatenation, 
for example) to use as a lookup key. This lookup key is then presented to a search 
engine, which is known to take on many forms in the art. The search engine performs its 
search looking for matches to the key and returning the information corresponding to that
key.

[0009] One method of doing such a lookup known in the art is to use a content
addressable memory (CAM) as part of the search engine. A CAM is a memory device
which returns an index when presented with the key. The index is the address within the
CAM where the first match of the key is found and is conventionally used as a pointer
into a separate RAM containing the data to be returned from the lookup.

[0010] In performing routing and switching lookups, the key is typically a
combination of the destination address of the packet and other parameters describing (for
example) the source interface on which the packet was received. The return value (the
index) can take on several forms, the most commonly used being a new destination
address which tells the switch where to send the packet.

[0011] The operation of CAMs, and their close cousin the ternary CAM (TCAM), in
the context of routing lookups is further described in M. Waldvogel et al., Scalable
High Speed IP Routing Lookups, Proceedings of the ACM SIGCOMM '97 Conference
on Applications, Technologies, Architectures, and Protocols for Computer
Communication, September 14-18, 1997, Cannes, France, incorporated herein by
reference in its entirety.
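
The CAM behavior described in paragraphs [0009] and [0010] (key in, index of first match out, index used as a pointer into a separate RAM) can be illustrated with a small software model. This is only a sketch: the names `ToyCam`, `lookup`, and `result_ram`, the table depth, and the sample keys and port strings are assumptions made for illustration and do not come from the disclosure.

```python
from typing import List, Optional


class ToyCam:
    """Software stand-in for a binary CAM (hypothetical, for illustration)."""

    def __init__(self, depth: int) -> None:
        # One slot per CAM address; None marks an unused entry.
        self.keys: List[Optional[int]] = [None] * depth

    def write(self, address: int, key: int) -> None:
        self.keys[address] = key

    def search(self, key: int) -> Optional[int]:
        # Return the address of the FIRST matching entry, or None on a miss.
        for address, stored in enumerate(self.keys):
            if stored == key:
                return address
        return None


def lookup(cam: ToyCam, result_ram: List[Optional[str]], key: int) -> Optional[str]:
    """Use the CAM index as a pointer into a separate result RAM."""
    index = cam.search(key)
    return None if index is None else result_ram[index]


cam = ToyCam(depth=8)
result_ram: List[Optional[str]] = [None] * 8
for address, key, data in [
    (2, 0xC0A80001, "output port 5"),
    (3, 0x0A000001, "output port 7"),
    (5, 0xC0A80001, "output port 9"),  # duplicate key at a later address
]:
    cam.write(address, key)
    result_ram[address] = data
```

Because the search returns the first match, the duplicate key stored at address 5 is never reached; a hardware CAM compares all entries in parallel, whereas this model scans them in address order only to keep the example short.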
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10012] One drawback long seen in the use of CAMs is that CAMs themselves have a 
finite (and often limited) allowable key width. That is, CAMs are typically built to a 
certain maximum key width of, for example, 32 or 64 bits. As routing information 
complexity increases with the addition of protocol, access control, and queuing/buffering 
control parameters to the lookup key, the number of bits needed in the key rapidly 
exceeds the maximum width of the CAM. This problem has previously been solved by 
performing parallel lookups in a plurality of CAMs, thus supplying an effectively wider
CAM through banking. However, such a solution is very expensive to implement in that
CAM devices are costly in terms of price, power requirements, and thermal dissipation. 
Also, the time delays involved in forming multiple keys, performing lookups in multiple 
physical devices, and concatenating or otherwise processing multiple results can provide 
an unacceptable packet slowdown in high speed switching and routing systems. 
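
The width problem can be made concrete with a little arithmetic. The sketch below assumes a 64-bit CAM (one of the example widths given above) and uses the 104-bit Layer 4 key and its 8/48/48-bit split that this document describes for one embodiment; the function name `cams_needed_for_banking` is hypothetical.

```python
def cams_needed_for_banking(key_bits: int, cam_width_bits: int) -> int:
    """Number of parallel CAMs needed to cover a key wider than one CAM."""
    # Ceiling division without floating point.
    return -(-key_bits // cam_width_bits)


FULL_KEY_BITS = 104          # 5-tuple key for a Layer 4 lookup
SUBKEY_BITS = [8, 48, 48]    # protocol, server tuple, client tuple
CAM_WIDTH_BITS = 64          # assumed CAM width for this example

banked = cams_needed_for_banking(FULL_KEY_BITS, CAM_WIDTH_BITS)
subkeys_fit_in_one_cam = all(bits <= CAM_WIDTH_BITS for bits in SUBKEY_BITS)
```

Under these assumptions the full key would need two banked devices, while each subkey fits within a single device of the same width.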

[0013] Accordingly, what is needed is a way to rapidly access a single CAM or other 
lookup device with a very wide key and return lookup results quickly and efficiently. 

SUMMARY 

[0014] Presently disclosed is a method for high-speed, segmented Layer 4
lookups in a specially organized content addressable memory (CAM). The method uses a
sequence of wide key lookups in a single CAM and provides faster and more efficient use
of CAM space. Multiple CAM devices and/or multiple slow lookups are avoided. 

[0015] Embodiments of the present invention may be used with content
addressable memories, ternary CAMs, or other variations on a content addressable
memory device with equal facility.
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[0016] Continuing series of lookups, such as those required to process the long 
streams of packets provided in modern data communications systems, can be performed
in either a sequential or pipelined fashion. In a sequential lookup arrangement, each of
the several lookups necessary to fully switch a packet is performed one after another. In
a pipelined arrangement (in steady state), the first lookup for a new packet is performed at
the same time as at least one of the subsequent lookups required for packets received
immediately beforehand. In other words, in a system requiring (for example) three
lookups per packet, the first lookup for packet N is performed at the same time as the
second lookup for packet N-1, and the third lookup for packet N-2, where N is a packet
received at a given time interval and packets N-x are packets received in the time
intervals immediately preceding packet N.

[0017] In another alternate embodiment, packet lookup sequences on packets arriving
on multiple ports of a single device may be carried out in a time division multiplexed (TDM)
arrangement. In such an arrangement, a given window of time (e.g., window a) in the
search engine is set aside for all of the lookups from port A. Time window b is set aside
for processing the lookups necessary to service packets arriving on port B, and so on for
each input port. Each of these time windows may be sized so that all of the necessary
lookups in a sequence for a single packet are performed. Alternately, each time window
may be sized such that packet lookup sequences for more than one packet received at
each designated port are processed within that window.

[0018] The speed and efficient packing of the data tables within the content
addressable memory space is provided by the intelligent selection of packet sub-fields for
lookup and the order in which the lookups are performed. Optimization is also provided
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by ordering the tables within the CAM in such a way that the shortest and least frequently 
changed tables are at the highest end of the CAM (i.e., the end of the CAM address space 
closest to address 0x0000 0000h) and compressed more tightly in the CAM space.

[0019] In one embodiment of the present invention, the lookup sequence for a single
packet is broken into three levels. First, a 5-tuple key (consisting of the 104 bits needed
for a Layer 4 lookup) is divided into three subkeys, one for each level. The first subkey
consists solely of an 8-bit protocol field. This key is used to index a small, 256 entry
Protocol Table within the CAM to directly read a corresponding Server Flow Table root
tree pointer (sFlowRTP).

[0020] Next, the server information tuple is used as the lookup key. The 32-bit IP
address and the 16-bit server port number (forming the server information tuple) are used
as the key to a Server Flow Table in the CAM to look up the Client Flow Table root tree
pointer (cFlowRTP). The beginning (root) of the Server Flow Table is identified by the
protocol lookup: the Server Flow Table root tree pointer defines the first address of the
Server Flow Table in the CAM to be used for the server information lookup.

[0021] Finally, client information lookup is performed using the 32-bit client IP
address and the 16-bit client port number as the client information tuple. As before, the
results of the last lookup (in this case, the Client Flow Table root tree pointer) are used to
define the starting address of the particular Client Flow Table within the CAM to be used
for this lookup. The results of the client information lookup define the MAC rewrite
information and (optionally) any network address translation (NAT) information for
Layer 4 switching of this particular packet.
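
The three-level sequence of paragraphs [0019] to [0021] can be sketched in software as follows. This is a minimal model under stated assumptions, not the disclosed hardware: the class `SegmentedCam`, the function `make_subkeys`, the use of dictionaries in place of CAM address regions, the pointer values 0x100 and 0x200, and the sample addresses and rewrite string are all hypothetical. A root pointer of zero is treated here as "no table".

```python
from typing import Dict, List, Optional, Tuple


def make_subkeys(protocol: int, server_ip: int, server_port: int,
                 client_ip: int, client_port: int) -> Tuple[int, int, int]:
    """Split the 104-bit 5-tuple into 8-, 48- and 48-bit subkeys."""
    protocol_key = protocol & 0xFF
    # Each 48-bit tuple is the 32-bit IP address concatenated with the 16-bit port.
    server_key = ((server_ip & 0xFFFFFFFF) << 16) | (server_port & 0xFFFF)
    client_key = ((client_ip & 0xFFFFFFFF) << 16) | (client_port & 0xFFFF)
    return protocol_key, server_key, client_key


class SegmentedCam:
    """Hypothetical stand-in for one CAM holding all three table types."""

    def __init__(self) -> None:
        # Level 1: 256-entry Protocol Table, protocol number -> sFlowRTP.
        self.protocol_table: List[int] = [0] * 256
        # Level 2: sFlowRTP -> Server Flow Table {server key -> cFlowRTP}.
        self.server_tables: Dict[int, Dict[int, int]] = {}
        # Level 3: cFlowRTP -> Client Flow Table {client key -> flow entry}.
        self.client_tables: Dict[int, Dict[int, str]] = {}

    def lookup(self, protocol: int, server_ip: int, server_port: int,
               client_ip: int, client_port: int) -> Optional[str]:
        p_key, s_key, c_key = make_subkeys(
            protocol, server_ip, server_port, client_ip, client_port)
        s_flow_rtp = self.protocol_table[p_key]                      # direct index
        c_flow_rtp = self.server_tables.get(s_flow_rtp, {}).get(s_key, 0)
        return self.client_tables.get(c_flow_rtp, {}).get(c_key)    # None on a miss


cam = SegmentedCam()
SERVER = (0x0A000001, 80)       # 10.0.0.1, port 80
CLIENT = (0xC0A80105, 40000)    # 192.168.1.5, port 40000
_, s_key, c_key = make_subkeys(6, *SERVER, *CLIENT)
cam.protocol_table[6] = 0x100                     # protocol 6 (TCP) -> sFlowRTP
cam.server_tables[0x100] = {s_key: 0x200}         # server tuple -> cFlowRTP
cam.client_tables[0x200] = {c_key: "rewrite MAC to 00:11:22:33:44:55"}
entry = cam.lookup(6, *SERVER, *CLIENT)
```

Each level consumes only its own subkey, and the result of one level selects the table searched by the next, so no single lookup ever needs the full 104-bit key.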
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DETAILED DESCRIPTION 



Process Flow 

[0029] Fig. 2 shows a flow chart of the method 200 of one embodiment of the present
invention. In step 201, the packet is received and the header is read by means well 
known in the art. In next step 210, the 8-bit protocol field located in the Internet Protocol 
(IP) header is read and used to perform a direct lookup in the Protocol Table. This 
lookup, which determines a pointer to the root tree for a particular server flow associated 
with the inbound packet, returns the value of the Server Flow Table root tree pointer 
sFlowRTP. 

[0030] Step 215 checks to determine if the sFlowRTP value is valid. This check, in 
one embodiment of the present invention, is nothing more than a simple "not equal to
zero" test. This test is used to determine if a valid (non-zero) sFlowRTP pointer has
been returned. If there is no sFlowRTP pointer, i.e., sFlowRTP is equal to zero, Layer 4 
switching is aborted and conventional Layer 3 switching is performed by means well 
known in the art at step 289. 

[0031] If the sFlowRTP is a valid number, then step 220 performs the server 
information lookup using the server IP address and server port number fields from the
packet header. To perform this lookup, the server IP address (a 32-bit number) and the 
server port number (16-bit number) are concatenated and used as the lookup key in a 
CAM lookup. The value returned by this lookup is the root tree pointer to the particular 
Client Flow Table containing the client flow routing information, designated cFlowRTP. 
Choosing to perform a server-only lookup helps to optimize Layer 4 switching and its 
associated lookups by simplifying the longest prefix match lookup problem so often seen 
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wire speed, without introducing additional packet latency, and with the simplest 
implementation cost.

[0043] Fig. 4 illustrates, at a high level of abstraction, the operation of a pipelined
lookup apparatus 400 in schematic form. A stream of packets 410, represented by their
header information Header0, Header1, Header2, Header3, etc., enters sFlowRTP Lookup
Engine 420. In a pipelined system according to one embodiment of the present invention
illustrated here, at time 0, packet Header0 has just been processed in sFlowRTP Lookup
Engine 420. Header-1 (i.e., the header of the packet immediately preceding the Header0
packet) has just been processed in cFlowRTP Lookup Engine 430 and Header-2 has just
been processed in L4 Info Lookup Engine 440. Table 470 identifies which packet header
gets processed in each engine (or "stage," as the pipeline elements are commonly called)
after each time interval. Thus, after the next time interval (Time = 1), Header1 will have
been processed by sFlowRTP Lookup Engine 420, Header0 will have been processed by
cFlowRTP Lookup Engine 430, and Header-1 will have been processed by L4 Info
Lookup Engine 440.

[0044] The operation of each engine is as follows: in sFlowRTP Lookup Engine 420,
the 8-bit protocol field is used to index Protocol Table 425, returning sFlowRTP. The
sFlowRTP value (and, in some embodiments, the current header) are then passed to
cFlowRTP Lookup Engine 430 in the next time interval.

[0045] The cFlowRTP Lookup Engine 430 generates its lookup key as described
above and performs the lookup in Server Flow CAM 435, returning lookup result
cFlowRTP. The cFlowRTP (and, in some embodiments, the current header) are then
passed to L4 Info Lookup Engine 440 in the next time interval.

Client Reference: Seq. 291 J/CPOL 79461

Attorney Docket No.: M-9750US


[0046] L4 Info Lookup Engine 440 forms its lookup key as described above and performs
the client flow lookup in Client Flow CAM 445. The result returned is L4 Rewrite data 
450, which passes out of the pipeline in the next time interval. 
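
The stage occupancy that Table 470 is said to record can be reproduced with a short sketch. It assumes one header enters the pipeline per time interval and that each stage takes exactly one interval; the function name `stage_occupancy` is hypothetical, while the stage names are those used for Fig. 4.

```python
from typing import Dict

STAGES = [
    "sFlowRTP Lookup Engine 420",
    "cFlowRTP Lookup Engine 430",
    "L4 Info Lookup Engine 440",
]


def stage_occupancy(time: int) -> Dict[str, int]:
    """Header number just processed by each stage at the given time.

    Stage k (counting from 0) lags the first stage by k intervals, so at
    time t it has just processed header t - k.
    """
    return {stage: time - k for k, stage in enumerate(STAGES)}


at_time_0 = stage_occupancy(0)   # Header0, Header-1, Header-2
at_time_1 = stage_occupancy(1)   # Header1, Header0, Header-1
```

A given header therefore leaves the last stage two intervals after it leaves the first, while a new header can still be accepted every interval.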

Time Division Multiplexing 

[0047] A further alternative scheme for employing the method of the present
invention involves using time division multiplexing (TDM) for switching lookups. With
TDM, multiple streams of packets are processed in a time-sharing scheme. The need for
a TDM scheme can arise when a switch device receives more than one stream of packet
data, for instance, when it is configured to receive packet streams on multiple input ports.
One well-known method of processing multiple packet streams is simply to replicate the
processing equipment with a dedicated set of equipment for each input port. This,
however, is very expensive and results in an arithmetically increased use of resources and
real estate as the number of ports grows.
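
A time-sharing arrangement of this kind, in which a single lookup subsystem serves every port in turn, might be sketched as below. The fixed round-robin window order, the function name `tdm_schedule`, the `packets_per_window` parameter, and the sample port and packet labels are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def tdm_schedule(port_queues: Dict[str, List[str]],
                 packets_per_window: int = 1) -> List[Tuple[int, str, str]]:
    """Assign queued packets to per-port time windows in fixed rotation.

    Returns (window number, port, packet) tuples in processing order. A
    port's window is consumed even when that port has nothing queued.
    """
    queues: Dict[str, Deque[str]] = {
        port: deque(packets) for port, packets in port_queues.items()
    }
    schedule: List[Tuple[int, str, str]] = []
    window = 0
    while any(queues.values()):
        for port, queue in queues.items():
            for _ in range(packets_per_window):
                if queue:
                    schedule.append((window, port, queue.popleft()))
            window += 1
    return schedule


one_per_window = tdm_schedule({"A": ["a0", "a1"], "B": ["b0"]})
two_per_window = tdm_schedule({"A": ["a0", "a1"], "B": ["b0"]},
                              packets_per_window=2)
```

With one packet per window, port A's second packet waits for the next rotation; widening the window to two packets lets both of port A's lookup sequences complete within window 0.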

[0048] It is preferable, therefore, to reduce the amount of equipment and real estate 
needed to perform multi-port/multi-stream packet switching by using only a single 
switching lookup subsystem to process packets from all ports. The method of the present 
invention also provides an answer to this problem. Because the system and method of the 
present invention can run at extremely high speeds, and because the sequenced table 
lookup utilizes the input port number, there is no inherent limitation to the use of the 
present invention in multi-port switch devices. The input port number, which is the client 
port number for packets received from a client or the server port number for packets 
received from a server, is part of the switching decision process. Thus, the only 
constraint on the overall process devolves to simply having enough table space to support 
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In re Chandrasekaran et al., Application No. 09/854,013 
Amendment A Pursuant to 37 CFR § 1.312

Claim 17 (original): The computer-readable medium of Claim 15, wherein said first 
lookup or said second lookup utilizes a content addressable memory. 



Claim 18 (currently amended): The computer-readable medium of Claim 1 




wherein said content addressable memory comprises a ternary content addressable memory. 

Claim 19 (original): The computer-readable medium of Claim 15, wherein said direct 
access, said first lookup, and said second lookup are pipelined. 

Claim 20 (original): The computer-readable medium of Claim 15, wherein for packets 
arriving at a plurality of ports, at least said selecting, said performing said first lookup, and 
said performing said second lookup are time division multiplexed. 

Claim 21 (original): The computer-readable medium of Claim 15, further comprising: 
validating said sFlowRTP prior to performing said first lookup;
validating said cFlowRTP prior to performing said second lookup; and 
validating said flow entry prior to performing said switching. 

Claim 22 (original): A computer data signal embodied in a carrier wave, comprising 
computer instructions for: 

parsing a packet header to obtain a plurality of data fields; 

selecting at least one of said plurality of data fields as a protocol identifier and using 
said protocol identifier to directly access a Server Flow Table start pointer (sFlowRTP);

performing a first lookup using said sFlowRTP and using one or more of said data
fields as a first key to determine a Client Flow Table start pointer (cFlowRTP); 

performing a second lookup using said cFlowRTP and using one or more of said data 
fields as a second key to obtain a flow entry; and 

switching said packet using said flow entry. 
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Application/ Control Number: 09/854,013 p age 2 

Art Unit: 2661 

1. An examiner's amendment to the record appears below. Should 
the changes and/or additions be unacceptable to applicant, an 
amendment may be filed as provided by 37 CFR 1.312. To ensure 
consideration of such an amendment, it MUST be submitted no 
later than the payment of the issue fee. 

Authorization for this examiner's amendment was given in a 
telephone interview with Kirk Williams on 11/04/04. 

The application has been amended as follows: 
IN THE CLAIMS: 

Claim 4, line 1, delete the term "further";

Claim 11, line 2, delete the term "further";

Claim 11, line 1, replace "Claim 11" with "Claim 10";

Claim 18, line 2, delete the term "further";

Claim 25, line 2, delete the term "further";

Claim 25, line 1, replace "Claim 27" with "claim 24".



REASONS FOR ALLOWANCE 

2. The following is an Examiner's statement of reasons for 
allowance: Claims 1-35 are considered allowable, as set forth in the
previous office action.

3. Any comments considered necessary by Applicant must be 
submitted no later than the payment of the issue fee and, to 
avoid processing delays, should preferably accompany the issue 



31. The method of Claim 29, wherein said means for performing said first lookup or 
said second lookup comprise a content addressable memory. 

32. The method of Claim 29, wherein said means for directly accessing and said 
means for performing said first lookup and said second lookup are pipelined. 

33. The method of Claim 29, wherein for packets arriving at a plurality of ports, at
least said means for selecting, said means for performing said first lookup, and said 
means for performing said second lookup are time division multiplexed. 

34. The method of Claim 29, further comprising: 

means for validating said sFlowRTP prior to performing said first lookup;
means for validating said cFlowRTP prior to performing said second lookup; and

means for validating said flow entry prior to performing said switching.



35. A lookup method for use in a switch, comprising the steps of: 
parsing a packet header to obtain a plurality of data fields; 
selecting at least one of said plurality of data fields as a protocol identifier;

performing a server flow lookup using said protocol identifier and one or more of 
said data fields to determine said Client Flow Table start pointer (cFlowRTP); 

performing a second lookup using said cFlowRTP and using one or more of said 
data fields as a second key to obtain a flow entry; and
switching said packet using said flow entry.
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